Search for: All records

Creators/Authors contains: "Langou, Julien"

  1. When designing an algorithm, one cares about arithmetic/computational complexity, but data movement (I/O) complexity plays an increasingly important role that highly impacts performance and energy consumption. For a given algorithm and a given I/O model, scheduling strategies such as loop tiling can reduce the required I/O down to a limit, called the I/O complexity, inherent to the algorithm itself. The objective of I/O complexity analysis is to compute, for a given program, its minimal I/O requirement among all valid schedules. We consider a sequential execution model with two memories, an infinite one, and a small one of size 𝑆 on which the computations retrieve and produce data. The I/O is the number of reads and writes between the two memories. We identify a common “hourglass pattern” in the dependency graphs of several common linear algebra kernels. Using the properties of this pattern, we mathematically prove tighter lower bounds on their I/O complexity, which improves the previous state-of-the-art bound by a parametric ratio. This proof was integrated inside the IOLB automatic lower bound derivation tool. 
  2. The QZ algorithm computes the generalized Schur form of a matrix pencil. It is an iterative algorithm and, at some point, it must decide when to deflate, that is, when a generalized eigenvalue has converged and to move on to another one. Choosing a deflation criterion that makes this decision is nontrivial. If it is too strict, the algorithm might waste iterations on already converged eigenvalues. If it is not strict enough, the computed eigenvalues might not have full accuracy. Additionally, the criterion should not be computationally expensive to evaluate. There are two commonly used criteria: the elementwise criterion and the normwise criterion. This paper introduces a new deflation criterion based on the size of and the gap between the eigenvalues. We call this new deflation criterion the strict criterion. This new criterion for QZ is analogous to the criterion derived by Ahues and Tisseur for the QR algorithm. Theoretical arguments and numerical experiments suggest that the strict criterion outperforms the normwise and elementwise criteria in terms of accuracy. We also provide an example where the accuracy of the generalized eigenvalues using the elementwise or the normwise criteria is less than two digits whereas the strict criterion leads to generalized eigenvalues which are almost accurate to the working precision. Additionally, this paper evaluates some commonly used criteria for infinite eigenvalues. 
  3. Numerical exceptions, which may be caused by overflow, operations like division by 0 or sqrt(−1), or convergence failures, are unavoidable in many cases, in particular when software is used on unforeseen and difficult inputs. As more aspects of society become automated (e.g., self-driving cars, health monitors, and cyber-physical systems more generally), it is becoming increasingly important to design software that is resilient to exceptions, and that responds to them in a consistent way. Consistency is needed to allow users to build higher-level software that is also resilient and consistent (and so on recursively). In this paper we explore the design space of consistent exception handling for the widely used BLAS and LAPACK linear algebra libraries, pointing out a variety of instances of inconsistent exception handling in the current versions, and propose a new design that balances consistency, complexity, ease of use, and performance. Some compromises are needed, because there are preexisting inconsistencies that are outside our control, including in or between existing vendor BLAS implementations, different programming languages, and even compilers for the same programming language. In addition, user requests from our surveys are quite diverse. We also propose our design as a possible model for other numerical software, and welcome comments on our design choices. 
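
The kind of inconsistency that entry 3 describes can be seen in miniature outside BLAS and LAPACK. The sketch below is only an analogy in plain Python, not code from the paper or from either library: it shows that the built-in `max` returns a different result depending on where a NaN appears in its arguments. The helper `consistent_max` is a hypothetical name introduced here to show one possible consistent policy (always propagate NaN). The paper's actual design choices may differ.

```python
import math

nan = float("nan")

# Python's built-in max() keeps its current candidate unless a later item
# compares strictly greater, and every ordered comparison with NaN is False.
first = max(nan, 1.0)   # NaN is the first candidate and is never displaced
second = max(1.0, nan)  # NaN arrives second and never displaces 1.0


def consistent_max(values):
    """Hypothetical helper: return NaN if any input is NaN, else the maximum.

    This is one possible consistent policy (always propagate NaN), chosen
    here for illustration only.
    """
    values = list(values)
    if any(math.isnan(v) for v in values):
        return float("nan")
    return max(values)


if __name__ == "__main__":
    print("max(nan, 1.0) is NaN:", math.isnan(first))
    print("max(1.0, nan) is NaN:", math.isnan(second))
    print("consistent_max([1.0, nan]) is NaN:", math.isnan(consistent_max([1.0, nan])))
```

With the built-in function, whether the caller ever sees the NaN depends on argument order. The helper makes the outcome independent of order, at the cost of an extra pass over the data, which is the sort of trade-off between consistency and performance that the abstract mentions.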